Introduction

Motivation

Many resources for carbonate chemistry focus on modern seawater-like solutions.

Consider the problem of finding the carbonate ion concentration, [CO\(_3^{2-}\)], given dissolved inorganic carbon (DIC) and pH. The first chapter of the your textbook, CO\(_2\) in Seawater: Equilibrium, Kinetics, and Isotopes, derives the following expression:

\[[CO_3^{2-}] = \frac{DIC}{1 +\frac{K_1^*}{[H^+]} + \frac{K_1^*K_2^*}{[H^+]^2}}\]

To solve this, we need to select appropriate values for the dissociation constants. Again, the text helpfully provides a list of experimentally-determined values for pK\(_1^*\) and pK\(_2^*\):

T(\(^\circ\)C) Composition P(atm) pK\(_1^*\) pK\(_2^*\) Source
25 freshwater 1 6.35 10.33 Usdowski (1982)
25 seawater 1 5.86 8.92 DOE (1994)
0 seawater 1 6.11 9.38 DOE (1994)

But many geological problems involve fluids that are not like modern seawater. Consider carbonate sediments forming in the Great Salt Lake, Utah. Given pH and DIC, how would we calculate [CO\(_3^{2-}\)]? This is not an easy question as it depends on what is an “appropriate” set of constants K\(_1^*\) and K\(_2^*\). We cannot simply look them up on a table unless someone has already determined them using water from Great Salt Lake.

What is PHREEQC?

Thermodynamic properties such as dissociation constants can be expressed either in terms of stochiometry or activity:

\[ K_2^* = K_2 \frac{\gamma_{ HCO_3^-}}{\gamma_{H^+}\gamma_{CO_3^{2-}}}\] where

\[ \gamma_a = f(all\: ions\: in\: solution) \]

Most programs use either the defintions on one side or the other. Program CO2sys uses stoichometric constants (left-hand side) for seawater. In contrast, programs such as EQ3/6, PHREEQC, and the Geochemist’s workbench use activity-based definitions (right-hand side). These programs use an algorithm to estimate activities based on all ions in solution. For our purposes, this will allow us to solve carbonate chemistry problems when we can’t look up K\(_1\) and K\(_2\) on a table.

In addition, PHREEQC also handles calculations involving non-carbonate minerals, redox speciation, isotopes, and even kinetics. We won’t cover these here, but it’s good to know we have that flexibility in the future.

PHREEQC was developed and maintained by the US Geological Survey (USGS) who still regularly update their website here. The website includes a user’s guide including examples which are pre-loaded in the phreeqc package in R. One of the creators, David Parkhurst, has posted many answers to user questions on forums. If you have having trouble, google your problem. There is a good chance one of the programmers has posted a solution.

What is R?

R is a free programming language that specializes in graphics and statistics. It also has an active community of practicioners who maintain very helpful web resources. Googling an unfamiliar command or function will usually return either official documentation or good forum discussions. Perhaps my favorite resource is the R graph gallery. This site has a wealth of data display options; you just find one you like, and the code to generate it is at the bottom of the page!

I’ve done by best to integrate some of these online resources. Clicking on blue text in this document will direct you to helpful resources for specific commands or topics.

Nearly all of the scripts presented here share the same three-part structure. The first section is written in R and deals with loading, selecting, or otherwise preparing data so that PHREEQC can read it. The middle part is mostly written around performing calculations in PHREEQC and storing the results as R variables. The final section is, again, mostly in R and deals with displaying the results. After a few modules, you will notice that most scripts will have at least 50% of their lines recycled from previous examples. This is by design; when creating a new diagram, it’s often best to modify an existing script rather than creating an entirely new one from scratch.

Overview

In these modules we will learn how to:

  • Translate simple problems into PHREEQC syntax
  • Control how PHREEQC returns information to R
  • Work with common inputs such as pCO\(_2\), total alkalinity, total carbon, and pH
  • Convert among pH scales
  • Use a loop to run multiple calculations in sequence

Our ultimate goal is to perform multiple calculations to create graphs such as the Bjerrum plot below. Such diagrams are invaluable problem-solving tools for carbonate chemistry, and R/PHREEQC allows users to adapt them on the fly to get real-time feedback for problem solving.

Figure 1: an interactive Bjerrum plot for 2 mmol total in carbon in seawater at 25C. Clicking and draging over an area of the plot will zoom in; double clicking resets the axes. Individual curves can be turned on and off by clicking on them in the legend.

Before we begin

You will need to download versions of R and R Studio that are compatible with your operating system. Once these programs are installed, you will need to install two additional packages:phreeqc and plotly. These can be downloaded by navigating to Tools>Install Packages and searching for them.

Module 1: Basic R and PHREEQC syntax

Basic PHREEQC inputs and outputs

As discussed by Zeebe and Wolf-Gladrow, (2001), the carbonate system has two degrees of freedom among six parameters: CO\(_2\), HCO\(_3^-\), CO\(_3^{2-}\), pH, total carbon (DIC), and alkalinity. The most basic use for PHREEQC is to input two paramaters of the carbonate system and and solve the rest. All that PHREEQC requires is that the relevant information be entered into a block called SOLUTION. PHREEQC will then calculate as many relevant properties of the solution as possible, including activities, molalities of free and complexed ions, and the solubilities of common mineral and gasses.

Example 1: Using pH and DIC

Calculate the total alkalinity for the following solution at 25\(^\circ\)C. Units are given in mmol/kgw. Print the output as list.

DIC = 2.1
pH = 8.2
Ca = 10

The script below translates these constraints into “PHREEQC speak” and prints the output as text. Concentrations are specified by an abbreviation for the relevant element (e.g., Ca). For elements with multiple valence states, the abbreviation is followed by the valence in parentheses. In PHREEQC, the abbreviation for DIC is C(4): i.e., the total amount of inorganic carbon with four valence electrons.

##  [1] "------------------------------------"                                            
##  [2] "Reading input data for simulation 1."                                            
##  [3] "------------------------------------"                                            
##  [4] ""                                                                                
##  [5] "\t  SOLUTION              "                                                       
##  [6] "\t  units         mmol/kgw"                                                       
##  [7] "\t  temp                25"                                                       
##  [8] "\t  pH                 8.2"                                                       
##  [9] "\t  C(4)               2.1"                                                       
## [10] "\t  Ca                10.7"                                                       
## [11] "-------------------------------------------"                                     
## [12] "Beginning of initial solution calculations."                                     
## [13] "-------------------------------------------"                                     
## [14] ""                                                                                
## [15] "Initial solution 1.\t"                                                            
## [16] ""                                                                                
## [17] "-----------------------------Solution composition------------------------------" 
## [18] ""                                                                                
## [19] "\tElements           Molality       Moles"                                        
## [20] ""                                                                                
## [21] "\tC(4)             2.100e-003  2.100e-003"                                        
## [22] "\tCa               1.070e-002  1.070e-002"                                        
## [23] ""                                                                                
## [24] "----------------------------Description of solution----------------------------" 
## [25] ""                                                                                
## [26] "                                       pH  =   8.200    "                        
## [27] "                                       pe  =   4.000    "                        
## [28] "      Specific Conductance (uS/cm,  25oC)  = 1085"                               
## [29] "                          Density (g/cm3)  =   0.99774"                          
## [30] "                               Volume (L)  =   1.00283"                          
## [31] "                        Activity of water  =   1.000"                            
## [32] "                 Ionic strength (mol/kgw)  =  2.248e-002"                        
## [33] "                       Mass of water (kg)  =  1.000e+000"                        
## [34] "                 Total alkalinity (eq/kg)  =  2.103e-003"                        
## [35] "                       Total CO2 (mol/kg)  =  2.100e-003"                        
## [36] "                         Temperature (oC)  =  25.00"                             
## [37] "                  Electrical balance (eq)  =  1.930e-002"                        
## [38] " Percent error, 100*(Cat-|An|)/(Cat+|An|)  =  82.11"                             
## [39] "                               Iterations  =   6"                                
## [40] "                         Gamma iterations  =   3"                                
## [41] "                      Osmotic coefficient  =   0.83426"                          
## [42] "                         Density of water  =   0.99704"                          
## [43] "                                  Total H  = 1.110145e+002"                      
## [44] "                                  Total O  = 5.551249e+001"                      
## [45] ""                                                                                
## [46] "----------------------------Distribution of species----------------------------" 
## [47] ""                                                                                
## [48] "                                                    MacInnes  MacInnes"          
## [49] "                                MacInnes       Log       Log       Log    mole V"
## [50] "   Species          Molality    Activity  Molality  Activity     Gamma   cm3/mol"
## [51] ""                                                                                
## [52] "   OH-            2.006e-006  1.604e-006    -5.698    -5.795    -0.097     -3.97"
## [53] "   H+             7.368e-009  6.310e-009    -8.133    -8.200    -0.067      0.00"
## [54] "   H2O            5.551e+001  9.998e-001     1.744    -0.000     0.000     18.07"
## [55] "C(4)         2.100e-003"                                                         
## [56] "   HCO3-          2.049e-003  1.828e-003    -2.688    -2.738    -0.050     24.89"
## [57] "   CO3-2          2.590e-005  1.326e-005    -4.587    -4.877    -0.291     -3.30"
## [58] "   CO2            2.522e-005  2.532e-005    -4.598    -4.597     0.002     34.43"
## [59] "Ca           1.070e-002"                                                         
## [60] "   Ca+2           1.070e-002  5.957e-003    -1.971    -2.225    -0.254    -17.79"
## [61] ""                                                                                
## [62] "------------------------------Saturation indices-------------------------------" 
## [63] ""                                                                                
## [64] "  Phase               SI** log IAP   log K(298 K,   1 atm)"                      
## [65] ""                                                                                
## [66] "  Aragonite         1.12     -7.10   -8.22  CaCO3"                               
## [67] "  Calcite           1.40     -7.10   -8.50  CaCO3"                               
## [68] "  CO2(g)           -3.13     -4.60   -1.47  CO2"                                 
## [69] "  H2O(g)           -1.50     -0.00    1.50  H2O"                                 
## [70] "  Portlandite      -8.62    -13.81   -5.19  Ca(OH)2"                             
## [71] ""                                                                                
## [72] "**For a gas, SI = log10(fugacity). Fugacity = pressure * phi / 1 atm."           
## [73] "  For ideal gases, phi = 1."                                                     
## [74] ""                                                                                
## [75] "------------------"                                                              
## [76] "End of simulation."                                                              
## [77] "------------------"                                                              
## [78] ""                                                                                
## [79] "------------------------------------"                                            
## [80] "Reading input data for simulation 2."                                            
## [81] "------------------------------------"                                            
## [82] ""                                                                                
## [83] "-------------------------------"                                                 
## [84] "End of Run after 4.513 Seconds."                                                 
## [85] "-------------------------------"                                                 
## [86] ""

The commands phrSetOutputStringsOn(TRUE) and phrGetOutputStrings() tell PHREEQC to return the output as a list (character strings). Line 34 has our answer: Total alkalinity (eq/kg) = 2.103e-003. Note that even though we used mmol/kgw as our input units, the outputs default to mol/kgw.

The list format gives a good overview of what PHREEQC is capable of. Here are some other notable outputs and where to find them:

  • The molality and activity of free hydrogen (line 53). This information will help us when we discuss pH scales in the next module.

  • The speciation of carbon among CO\(_{2(aq)}\), HCO\(^-_{3}\), and CO\(^{2-}_{3}\) (lines 56-59).

  • Saturation index (described below) and K\(_{sp}\) values for mineral phases such as aragonite and calcite (lines 66-67)


Parsing answers with SELECTED_OUTPUT

The list view shows us the whole range of outputs PHREEQC can calculate just using the SOLUTION block. However, what we really want is to select 2-3 outputs from the list and save them so we can plot them later. The SELECTED_OUTPUT block returns only the values specifically requested by the user. Additionally, it also formats values so that R reads them as numbers rather than text. These data formats make it easier to perform other operations, like plotting the results on a graph.

Example 2: Using total alkalinity and DIC

Calculate pH for the following solution at 25\(^\circ\)C. Units are given in mmol/kgw. Only print pH in the output.

DIC = 2.1
alkalinity = 2.1
Ca = 10.7

## [1] "The pH is 8.18"

Let’s take a closer look at how R stores the results for phrGetSelectedOutput(). Look at the top-right window in R studio. You should see a header that says ‘Global Environment’ with subheaders such as ‘data’ and ‘values’. After we run the code, an object named output will appear with the description ‘List of 1’. Clicking on output will open it in a new window where you can navigate through the list using the blue arrows. This is a nifty way to check what PHREEQC has given you and where it is stored.

The last two lines show how to access and save information from the output. R uses the $ character to navigate through nested objects. R reads the syntax output$n1$pH as follows: "find the object named output, navigate to a subdivision called n1, and find the element(s) named pH. The command <- stores whatever is on the righthand side as a variable (or set of variables). So the syntax pH <- output$n1$pH tells R to navigate somewhere and store the relevant value(s) as a new variable named pH_out.

As a side note, this example shows two things I absolutely love about R. The first is that the syntax navigates by the names of objects. This makes R scripts easier to comprehend than some programming languages so long as you assign descriptive variable names. The second is that R has smartphone-level awareness of what you are trying to do. Try this: start a new line at the end of the example above and type output$. If you already have output in the Global Environment, R will automatically bring up a drop-down menu with n1. If you continue with output$n1$, R will again bring up a list of all objects within n1, inlcuding pH. This is a great quality-of-life perk.


Working with pCO\(_2\) and saturation states

In the previous examples we solved the carbonate system using some combination of pH, alkalinity, and DIC. Another variable, pCO\(_2\), appears in labarotory experiments and in the context of paleoclimate. Many laboratory experiments ‘set’ carbonate chemistry by circulating gas through the tank to mainatin a constant pCO\(_2\) in the headspace. Such studies will then report ambient pCO\(_2\) and pH conditions, which are enough to solve the rest of the carbonate variables. Paired pCO\(_2\) and pH values also appear in papers about Earth’s atmospheres and oceans. Let’s try an example with pCO\(_2\) and pH as inputs.

Example 3: Using pH and pCO\(_2\)

Calculate the calcite saturation state (\(\Omega\)) for the following solution at 25\(^\circ\)C.

pH = 8
CO\(_{2(g)}\) = 10\(^{-4}\) atm
Ca = 10.7 mmol/kgw

We need to know more about how PHREEQC handles saturation with respect to solids and gases. In PHREEQC, the shorthand for CO\(_2\) partial pressure is ‘CO2(g)’ followed by log\(_{10}\)(pressure). Note that unlike dissolved ions, we cannot define the pressure units in the SOLUTION block; the default is always atm (at least as far as I know). Please keep this in mind as you will often see data reported in other units such as bars, ppm, mmHg, etc.

Solid phases are depicted in terms of their saturation index. It is related to the more common saturation state, \(\Omega\), by:

\[ SI = log_{10}(\Omega) \] The script below formats the inputs in terms of pH and pCO2:

## [1] "The calcite saturation state is 5.08"

This syntax is not intuitive. First of all, why do we have C(4) when we were not given DIC in the problem? And why is it followed by 2 (presumably mmol/kgw?). And what’s the rationale for putting both DIC and CO2(g) on the same line?

The best answer I can offer is that CO2(g) is a property of the gas in contact with the solution, not the solution itself. As such, PHREEQC does not recognize it as a valid command in the SOLUTION block. The workaround is to select another parameter, such as DIC, and ask PHREEQC to change it until the solution is in equilibrium with CO2(g). The first item is a property that belongs in the INPUT block, e.g., a total concentration of an element (DIC) or pH. The second one is usually a mineral or gas phase. The number in the middle (e.g., ‘2’ in the example above) tells PHREEQC to start ‘guessing’ at an initial value and adjust it from there. In words, the above syntax tells PHREEQC: “Hold pH at 8 but adjust DIC until the solution is in equilibrium with 10\(^{-3.4}\) atm CO\(_2\). Use DIC = 2 mmol/kgw as an initial guess”

To demonstrate, let’s go back to Example 3 and try different values for the number between ‘C(4)’ and ‘CO2(g)’. The answer (usually) won’t change: PHREEQC is pretty good at converging on the correct answer even if the intial guess is bad.


Summary

The PHREEQC package in R can be called with relatively few command lines. As long we provide the SOLUTION block with two carbonate parameters, it will calculate the remaining four. We can also get Concentrations, activities, and activity coefficients for CO\(_{2(aq)}\), HCO\(^-_{3}\), and CO\(^{2-}_{3}\). PHREEQC can also handle saturations for gas and mineral phases, although there is a special syntax if these are specified in SOLUTION. The SELECTED_OUTPUT block allows us to better control how information is passed from PHREEQC back to R. In the next section, we will focus on R syntax that will allow us to run multiple calculations in PHREEQC efficiently.


References

Zeebe, Richard E., and Dieter Wolf-Gladrow. CO\(_2\) in seawater: equilibrium, kinetics, isotopes. No. 65. Gulf Professional Publishing, 2001.

Module 2: Converting between pH scales

Overview

It may be surprising to learn that there is no unified pH scale used in chemical oceanography (for a review, see Zeebe and Wolf-Gladrow, 2001). A reasonable definition would be based on the activity of hydrogen ions:

\[ pH = - log_{10} \{H^+\} \]

Unfortunately, this definition is based on a quantity (activity) which cannot be measured directly. The NBS scale (named after the National Bureau of Standards, now NIST) is defined on a series of dilute buffer solutions such that:

\[ pH_{NBS} \approx - log_{10} \{H^+\} \]

Chemical oceanographers rarely use NBS buffer solutions because their ionic strength (\(\approx\) 0.1) is much lower than seawater (\(\approx\) 0.7). Hansson (1973) proposed a “total” scale based on a buffer solution of artificial seawater. On this scale, the pH is defined as:

\[ pH_{T} = - log_{10} \left([H^+] + [HSO4^-]\right) \]

Two other scales that are commonly used are also based on ion concentrations: the free scale (f) and the seawater scale (SWS)

\[ pH_{F} = - log_{10} \left([H^+]\right) \]

\[ pH_{T} = - log_{10} \left([H^+] + [HSO_4^-] + [HF]\right) \]

What pH scale does PHREEQC use?

In light of all these scales, we might have a few questions about how PHREEQC handles pH. Let’s set up the following experiment. First, we will use the SOLUTION block to specify pH for a seawater-like solution. Then we will use the SELECTED_OUTPUT block to return activities and concentrations of H\(^+\) and HSO\(_4^-\). We can then use functions like log10 to calculate the pH using the definitions for each scale.

Note: at this time, the pitzer database does not include F as a species, so we will not use the seawater scale.

Example: working with pH scales

Using the following ion composition from seawater, write a script that depicts pH on the NBS, free, and total scales. Which scale is the default in PHREEQC?

## [1] "the pH is 8.3 on the NBS scale"
## [1] "the pH is 8.2082993772547 on the free scale"
## [1] "the pH is 8.09658386317084 on the total scale"

We can see that PHREEQC defines based on the activity of hydrogen, which is best matched by the NBS definition! Since the NBS definition is almost never used for concentrated solutions, we will need to add scale converters into most of our scripts. Luckily, the above example shows that we can covert scales with just a few extra lines.

Module 3: Multiple calculations, loops, and plots

Simple loops

Previously we have used the paste() function to convert numbers to a text format PHREEQC can read. The next step would be to pass many numbers to PHREEQC to run a series of calculations. Unlike programs like CO2sys, PHREEQC will not take vectorized inputs directly. Instead, we can use a loop to run multiple calculations one after the other. A simple loop may look something like this:

Example: Multiple calculations

Consider the following solution at 25C

Ca = 10
DIC = 2.0

Show how carbonate ion concentration, [CO3], varies over a pH range of 4-12

## [1] 7.836307e-12 5.240250e-08 1.523094e-05 8.818521e-04 1.976510e-03

sing a loop requires some planning before we pass the numbers to PHREEQC. The syntax pH<-c() saves the numbers in parentheses as a vector. A vector is a one-dimensional group of elements akin to a row or column in a spreadsheet. In R, the entries in a vector can be called using square brackets. For example, typing pH[3] returns the third entry in pH:

## [1] 8

Our loop will call PHREEQC once for every entry in pH. But what happens if we add more entries? We can ask R to count the number of entries in a vector using the size() command. We then save the number of entries as a variable, n, and then tell the loop to run n times. This is much more useful than it first appears. Say I have 20 pH measurements after one field season and then collect 20 more. R will have no problem if I update my data to include more entries as it checks the size before each run.

We have also used another new function, rep, to make an n-length vector,CO3, and set all entries as zero. During each step the loop then overwrites an empty entry with the calculated value for [CO\(_3^{2-}\)]. The strategy of “preallocating” space to store the output of a loop increases computational efficiency. The details are beyond the scope of our course, but you can start reading about the topic here.

Inside the loop we have a new variable, step, which takes on a different value during each pass of the loop: step = 1 during the first pass, step = 2 during the second, and step = n on the nth (final) pass. The syntax pH[step] and CO3[step] ensure that each pass of the loop references the correct entry in the input and output vectors.

The seq() function

Now we can clearly see the advantage of using PHREEQC in a loop: if we can teach the computer to do something once, it can do it a hundred or a thousand times. We are very close to being creating graphs such as Bjerrum plots! Still, we might want to run tens or hundreds of calculations over a pH range to get a plot that is nice and ‘filled out’. This is a common task, and most programming languages like R have built-in commands such as seq() that create vectors with evenly-spaced elements.

Example: Multiple calculations

Consider the following solution at 25C

Ca = 10
DIC = 2.0

Show how carbonate ion concentration, [CO3], varies over a pH range of 4-12

##  [1]  1  2  3  4  5  6  7  8  9 10

Sometimes it’s more helpful to assign a number points, m, and calculate the spacing (\(\Delta\)) algebraically:

\[ \Delta = \frac{max - min}{m -1}\]

The following script implements the seq() command to make an evenly-spaced vector of pH values.

##  [1] 7.836307e-12 1.657802e-11 3.503855e-11 7.395038e-11 1.557478e-10
##  [6] 3.270235e-10 6.836559e-10 1.420382e-09 2.925579e-09 5.954643e-09
## [11] 1.192846e-08 2.340690e-08 4.476546e-08 8.305187e-08 1.489608e-07
## [16] 2.579325e-07 4.316061e-07 7.001299e-07 1.105965e-06 1.709818e-06
## [21] 2.599405e-06 3.902035e-06 5.802522e-06 8.568626e-06 1.258694e-05
## [26] 1.841296e-05 2.683955e-05 3.898754e-05 5.641935e-05 8.126848e-05
## [31] 1.163599e-04 1.652631e-04 2.321687e-04 3.214250e-04 4.365640e-04
## [36] 5.787997e-04 7.453740e-04 9.286441e-04 1.116911e-03 1.297153e-03
## [41] 1.458452e-03 1.594331e-03 1.703103e-03 1.786687e-03 1.848925e-03
## [46] 1.894191e-03 1.926553e-03 1.949404e-03 1.965395e-03 1.976510e-03

Basic plots in R

The final hurdle is to graph our results. The plot() command can do this given only the vectors that belong on the x and y axes:

Example: Multiple calculations

Consider the following solution at 25C

Ca = 10
DIC = 2.0

Show how carbonate ion concentration, [CO3], varies over a pH range of 4-12

Note that we have used a log transform on the data as it varies by many orders of magnitude.

The plot function has many additional features that can modify features such as axes, e.g.,

Lastly, there are many more sophisticated graphics packages in R such as ggplot2 and the plotly library which has been imported from python. We will not cover these explicitly, but the online communities for them are amazing! I highly recommend you give them a try and just google the correct syntax.

Module 4: Making Bjerrum plots

How accurate is the Pitzer database?

By now we have enough programming tools to tackle bigger scientific questions. The first thing we should do is test the performance of the Pitzer database in PHREEQC. Consider the following relationship between stochiometric-based and activity-based definitions for dissociation constants, e.g., K\(_2\):

\[ K_2^* = K_2 \frac{\gamma_{ HCO_3^-}}{\gamma_{H^+}\gamma_{CO_3^{2-}}}\]

where K\(_2^*\) is the stochiometric dissociation constant, K\(_2\) is defined for an infinitely dilute solution at a fixed temperature and pressure, and \(\gamma_a\) denotes the activity of species a. The pitzer database in PHREEQC uses activites (right-hand side) while texts such as Zeebe and Wolf-gladrow (2001) use the stochoimentric definitions (left hand side). If we render our PHREEQC results as a Bjerrum plot, we can spot check our work by comparing it to tabulated values of pK\(_1^*\) and pK\(_2^*\) in well-defined solutions such as modern seawater. If our results agree, then we have a program than can easily calculate the carbon speciation in other, more exotic solutions such as lakes.

The Bjerrum Plot

The following code produces a fully functional Bjerrum plot. The plotly package lets us add hovertext displaying pH and concentration of individual species in \(\mu\)molar.

####################### Example 1: Inputs for DIC and pH ##########################

  remove(list= ls())                     # Clear the workspace from the previous run

  library(phreeqc)                       # Load the PHREEQC library
  library(plotly)

  phrLoadDatabaseString(pitzer.dat)      # Use the pitzer database

  pH_min          =         4            # minimum value for the pH axis [Free scale]
  pH_max          =      12.4            # maximum value for the pH axis [Free scale]
  n               =      1000            # number of pH nodes 

# Use the 'seq()' command to generate a vector of pH inputs

  pH_range <- seq( from = pH_min, to = pH_max, by = (pH_max - pH_min)/(n-1))

# Create empty vectors to store CO2, HCO3, and CO3 values

  CO2            <-   rep(0, n)
  HCO3           <-   rep(0, n)
  CO3_free       <-   rep(0, n)
  Mg_CO3         <-   rep(0, n)
  CO3_all        <-   rep(0, n)
  H_free         <-   rep(0, n)
  HSO4           <-   rep(0, n)

################################# PHREEQC loop ####################################  

  for (loop in 1:n)
  {
    step = loop

  input <- c(                            # Text string defining composition, units, 
                                         # and temperature.
             '  SOLUTION              '                                             ,   
             '  units         mmol/kgw'                                             ,
             '  temp                25', # temperature in degrees C                
             '  C(4)               2.0'                                             ,
       paste('  pH    ',pH_range[step], sep = ' ')                                  ,
             '  Ca                10.7',                                            
             '  Mg                  53'                                             ,
             '  Na                 469'                                             ,
             '  S(6)                28'                                             , 
             '  K                 10.2'                                             ,
             '  Cl                 546'                                             ,
  
             '  SELECTED_OUTPUT       '                                             ,
             ' -molalities CO2  HCO3- '                                             ,
             '  CO3-2 MgCO3 H+  HSO4- '                                             )

# Run input for current step and save species concentrations. Note that outputs
# are saved as log10 concentrations as this is what we want on the y-axis

  phrRunString(input)                    
  output <- phrGetSelectedOutput()       

  CO2[step]      <-   output$n1$m_CO2.mol.kgw.
  HCO3[step]     <-   output$n1$m_HCO3..mol.kgw.
  CO3_free[step] <-   output$n1$m_CO3.2.mol.kgw.
  Mg_CO3[step]   <-   output$n1$m_MgCO3.mol.kgw.    
  CO3_all[step]  <-   CO3_free[step] + Mg_CO3[step]
  H_free[step]   <-   output$n1$m_H..mol.kgw.
  HSO4[step]     <-   output$n1$m_HSO4..mol.kgw.
  }

################################ PHREEQC output ###################################   

# Convert pH scales and merge results into a data frame

  pH_total       <- -log10(H_free + HSO4)
  log_CO2        <- log10(CO2)
  log_HCO3       <- log10(HCO3)
  log_CO3_free   <- log10(CO3_free)
  log_Mg_CO3     <- log10(Mg_CO3)
  log_CO3_all    <- log10(CO3_all)
  
  
  DIC      <- CO2+HCO3+CO3_all
  log_DIC  <- log10(DIC)

  results  <- data.frame(pH_total, CO2, HCO3, CO3_all)

# Label plot and axes

  x      <- list(title = " pH (total scale)")
  y      <- list(title = "log concentration (moles)")
  title  <- list(text  = "Bjerrum plot for seawater at 25C", x = .45, y = .9)
  
# Set formatting for plot
  
  fig <- plot_ly(results, x = ~pH_total)
  fig <- fig %>% add_trace(y = ~log_CO2, name = 'CO2', mode = 'line', type = 'scatter',
                           line = list(width = 4),
                           text = ~CO2*10^6, hovertemplate = paste('%{text: .1f} umol'))
  fig <- fig %>% add_trace(y = ~log_HCO3, name = 'HCO3', mode = 'line',type = 'scatter',
                           line = list(width = 4),
                           text = ~HCO3*10^6, hovertemplate = paste('%{text: .0f} umol'))
  fig <- fig %>% add_trace(y = ~log_CO3_all, name = 'CO3 (total)', mode = 'line',type = 'scatter',
                           line = list(width = 4),
                           text = ~CO3_all*10^6, hovertemplate = paste('%{text: .0f} umol'))
  fig <- fig %>% add_trace(y = ~log_DIC, name = 'DIC', mode = 'line',type = 'scatter',
                           line = list(dash = 'dash', width = 3),
                           text = ~DIC*10^6, hovertemplate = paste('%{text: .0f} umol'))
  
  fig <- fig %>% layout(yaxis = list(range = c(-5, -2)))
  fig <- fig %>% layout(hovermode = "x unified", yaxis = list(hoverformat = '.2f'), 
                        xaxis = list(hoverformat = '.2f'), 
                        hoverlabel = list(font=list(size=20)))
  fig <- fig %>% layout(xaxis = x, yaxis = y, title = title, font = list(size = 20),
                        legend = list(x = 1, y = 0.5))
  
# Display the plot as an HTML widget
  
  fig

Figure 1: an interactive Bjerrum plot for 2 mmol total in carbon in seawater at 25C. Clicking and draging over an area of the plot will zoom in; double clicking resets the axes. Individual curves can be turned on and off by clicking on them in the legend.

Recall that the crossover points where [CO\(_2\)(aq)] = [HCO\(_3^-\)] and [HCO\(_3^-\)] = [CO\(_3^{2-}\)] occur at pK\(_1^*\) and pK\(_2^*\) respectively. Use the hovertext to read the pH values at the crossover points and compare them to the appropriate pK\(_1^*\) and pK\(_2^*\) values on the tables below.

T(\(^\circ\)C) Composition P(atm) pK\(_1^*\) pK\(_2^*\) Source
25 freshwater 1 6.35 10.33 Usdowski (1982)
25 seawater 1 5.86 8.92 DOE (1994)
0 seawater 1 6.11 9.38 DOE (1994)
Source pH\(_{SWS}\) pCO\(_2\) (\(\mu\)atm) CO\(_2\)(aq) (\(\mu\)mol) HCO\(_3^-\) (\(\mu\)mol) CO\(_3^{2-}\) (\(\mu\)mol)
Roy 8.08 354 10.0 1735 255
Hasson 8.10 343 9.7 1739 251
Merbach 8.11 327 9.3 1742 249